{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Short walkthrough on Benchmarking in Anomalib"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "IJlBPLRvOYuv"
   },
   "source": [
    "## Install Anomalib"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "HmOFHNPsJV4H",
    "outputId": "ad77e030-c2dd-4dbc-f4c3-882e1229999f"
   },
   "outputs": [],
   "source": [
    "!git clone https://github.com/openvinotoolkit/anomalib.git"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "N6bEfY5HOMfQ",
    "outputId": "55756dfd-955d-49e3-9f5f-f387454010fe"
   },
   "outputs": [],
   "source": [
    "% cd anomalib"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "zPCaYurIPQPC",
    "outputId": "d1a40d6b-d1b6-4464-8259-b52116523229"
   },
   "outputs": [],
   "source": [
    "% pip install ."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 1000
    },
    "id": "BjtTp3wdV43q",
    "outputId": "fc110924-7bb8-42e4-f019-955a7daee03b"
   },
   "outputs": [],
   "source": [
    "! pip install -r requirements/openvino.txt"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "0NJboi_7XSSN"
   },
   "source": [
    "> Note: Restart Runtime if promted by clicking the button at the end of the install logs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "y4sQOIwOUO0u"
   },
   "source": [
    "## Download and setup dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "SN3b1L115gIY",
    "outputId": "88804d1d-072b-41ba-e77e-e57a54558d2a"
   },
   "outputs": [],
   "source": [
    "!wget https://openvinotoolkit.github.io/anomalib/_downloads/3f2af1d7748194b18c2177a34c03a2c4/hazelnut_toy.zip"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "mIrbX6tRWrAM",
    "outputId": "c43009f1-f56c-435b-f034-36a25243df42"
   },
   "outputs": [],
   "source": [
    "% cd /content/anomalib/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "XDUsHlfr5wnI"
   },
   "outputs": [],
   "source": [
    "!mkdir datasets && unzip hazelnut_toy.zip -d datasets/ > /dev/null"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Mb_kkxi-URk7"
   },
   "source": [
    "## Create configuration file for training using Folder Dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following configuration file is based on the one at `anomalib/models/padim/config.yaml`. The configuration file at that location uses the MVTec dataset for training. Since we are working with a custom dataset, we will use the `Folder` datset format. In this format, the images are divided among folders such as _good_, and _colour_. Optionally, it can also contain a _mask_ folder as shown below.\n",
    "\n",
    "```bash\n",
    "hazelnut_toy\n",
    "├── colour\n",
    "│  ├── 00.jpg\n",
    "│  ├── 01.jpg\n",
    "│  ...\n",
    "├── good\n",
    "│  ├── 00.jpg\n",
    "│  ├── 01.jpg\n",
    "└── mask\n",
    "   ├── 00.jpg\n",
    "   ├── 01.jpg\n",
    "   ...\n",
    "```\n",
    "\n",
    "Each of these folders contain images belonging to their respective category. Since we are using the `hazelnut_toy` dataset, we need to change a few lines in the PaDiM configuration as shown below.\n",
    "\n",
    "```yaml\n",
    "dataset:\n",
    "  name: <name-of-the-dataset>\n",
    "  format: folder\n",
    "  path: <path/to/folder/dataset>\n",
    "  normal_dir: normal # name of the folder containing normal images.\n",
    "  abnormal_dir: abnormal # name of the folder containing abnormal images.\n",
    "  normal_test_dir: null # name of the folder containing normal test images.\n",
    "  task: segmentation # classification or segmentation\n",
    "  mask: <path/to/mask/annotations> #optional\n",
    "  extensions: null\n",
    "  split_ratio: 0.2  # ratio of the normal images that will be used to create a test split\n",
    "```\n",
    "\n",
    "The complete configuration is in the codeblock below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "id": "GNSo19XlPixN"
   },
   "outputs": [],
   "source": [
    "folder_padim = \"\"\"\n",
    "dataset:\n",
    "  name: hazelnut\n",
    "  format: folder\n",
    "  path: /content/anomalib/datasets/hazelnut_toy\n",
    "  normal_dir: good # name of the folder containing normal images.\n",
    "  abnormal_dir: colour # name of the folder containing abnormal images.\n",
    "  normal_test_dir: null # name of the folder containing normal test images.\n",
    "  task: segmentation # classification or segmentation\n",
    "  mask: /content/anomalib/datasets/hazelnut_toy/mask/colour # optional\n",
    "  extensions: null\n",
    "  split_ratio: 0.2  # ratio of the normal images that will be used to create a test split\n",
    "  image_size: 256\n",
    "  train_batch_size: 32\n",
    "  test_batch_size: 32\n",
    "  num_workers: 8\n",
    "  transform_config:\n",
    "    train: null\n",
    "    val: null\n",
    "  create_validation_set: false\n",
    "  tiling:\n",
    "    apply: false\n",
    "    tile_size: null\n",
    "    stride: null\n",
    "    remove_border_count: 0\n",
    "    use_random_tiling: False\n",
    "    random_tile_count: 16\n",
    "\n",
    "model:\n",
    "  name: padim\n",
    "  backbone: resnet18\n",
    "  layers:\n",
    "    - layer1\n",
    "    - layer2\n",
    "    - layer3\n",
    "  normalization_method: min_max # options: [none, min_max, cdf]\n",
    "\n",
    "metrics:\n",
    "  image:\n",
    "    - F1Score\n",
    "    - AUROC\n",
    "  pixel:\n",
    "    - F1Score\n",
    "    - AUROC\n",
    "  threshold:\n",
    "    image_default: 3\n",
    "    pixel_default: 3\n",
    "    adaptive: true\n",
    "\n",
    "project:\n",
    "  seed: 42\n",
    "  path: ./results\n",
    "\n",
    "logging:\n",
    "  log_images_to: [\"local\"] # options: [wandb, tensorboard, local].\n",
    "  logger: [] # options: [tensorboard, wandb, csv] or combinations.\n",
    "  log_graph: false # Logs the model graph to respective logger.\n",
    "\n",
    "optimization:\n",
    "  openvino:\n",
    "    apply: false\n",
    "\n",
    "# PL Trainer Args. Don't add extra parameter here.\n",
    "trainer:\n",
    "  accelerator: auto # <\"cpu\", \"gpu\", \"tpu\", \"ipu\", \"hpu\", \"auto\">\n",
    "  accumulate_grad_batches: 1\n",
    "  amp_backend: native\n",
    "  auto_lr_find: false\n",
    "  auto_scale_batch_size: false\n",
    "  auto_select_gpus: false\n",
    "  benchmark: false\n",
    "  check_val_every_n_epoch: 1 # Don't validate before extracting features.\n",
    "  default_root_dir: null\n",
    "  detect_anomaly: false\n",
    "  deterministic: false\n",
    "  devices: 1\n",
    "  enable_checkpointing: true\n",
    "  enable_model_summary: true\n",
    "  enable_progress_bar: true\n",
    "  fast_dev_run: false\n",
    "  gpus: null # Set automatically\n",
    "  gradient_clip_val: 0\n",
    "  ipus: null\n",
    "  limit_predict_batches: 1.0\n",
    "  limit_test_batches: 1.0\n",
    "  limit_train_batches: 1.0\n",
    "  limit_val_batches: 1.0\n",
    "  log_every_n_steps: 50\n",
    "  max_epochs: 1\n",
    "  max_steps: -1\n",
    "  max_time: null\n",
    "  min_epochs: null\n",
    "  min_steps: null\n",
    "  move_metrics_to_cpu: false\n",
    "  multiple_trainloader_mode: max_size_cycle\n",
    "  num_nodes: 1\n",
    "  num_processes: null\n",
    "  num_sanity_val_steps: 0\n",
    "  overfit_batches: 0.0\n",
    "  plugins: null\n",
    "  precision: 32\n",
    "  profiler: null\n",
    "  reload_dataloaders_every_n_epochs: 0\n",
    "  replace_sampler_ddp: true\n",
    "  sync_batchnorm: false\n",
    "  tpu_cores: null\n",
    "  track_grad_norm: -1\n",
    "  val_check_interval: 1.0 # Don't validate before extracting features.\n",
    "\"\"\"\n",
    "with open(\"config.yaml\", \"w\", encoding=\"utf8\") as f:\n",
    "    f.writelines(folder_padim)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "jpjtUHyWUXx0"
   },
   "source": [
    "## Train the model to see if it is working"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "h-GnKXC-KAi4",
    "outputId": "12aa0da9-2d02-49ba-f10e-9e1fb3630c6f"
   },
   "outputs": [],
   "source": [
    "! python ./tools/train.py --config config.yaml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Wt6BCkcoUch7"
   },
   "source": [
    "## Create Benchmarking config"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Benchmarking runs are configured using a yaml file. It contains five sections. The first one is `seed:` it is used to reproducibility across benchmarking runs. One of the uniqueness of Anomalib is that it supports deployment to edge devices using [OpenVINO](https://docs.openvino.ai/latest/index.html). This enables optimized performance and faster inference on majority of Intel devices. The benchmarking script can be used to compute OpenVINO inference throughput. To do this, `compute_openvino:` should be set to `true`.\n",
    "\n",
    "> Note: Not all models in Anomalib support OpenVINO export.\n",
    "\n",
    "The `hardware` section of the config file is used to pass the list of hardwares on which to compute the benchmarking results. If the host system has multiple GPUs, then the benchmarking computation is distributed across GPUs to speed up collection of results. By default, the results are gathered in a `csv` file but with the `writer` flag, you can also save the results to `tensorboard` and `wandb` loggers. The final section is the `grid_search` section. It has two parameters, _dataset_ and *model_name*. The _dataset_ field is used to set the values of grid search while the *model_name* section is used to pass the list of models for which the benchmark is computed.\n",
    "\n",
    "In this notebook we are working with a toy dataset, so we also need to tell the benchmarking script to use that particular dataset instead of the default `MVTec` as defined in each of the model config file. We can either update each config file or just pass a list of one value for the fields such as _format_, _path_, etc., as shown below.\n",
    "\n",
    "For more information about benchmarking, you can look at the [Anomalib Documentation](https://openvinotoolkit.github.io/anomalib/guides/benchmarking.html)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "qdRZlPl9Sh8U"
   },
   "outputs": [],
   "source": [
    "# While every attribute in dataset and model can be used to perform grid search,\n",
    "# in this example the lists with only single values are used for patching the\n",
    "# original model config\n",
    "benchmarking_params = \"\"\"seed: 42\n",
    "compute_openvino: true\n",
    "hardware:\n",
    "  - gpu\n",
    "writer: []\n",
    "grid_search:\n",
    "  dataset:\n",
    "    name: [hazelnut]\n",
    "    format: [folder]\n",
    "    path: [/content/anomalib/datasets/hazelnut_toy]\n",
    "    normal_dir: [good]\n",
    "    abnormal_dir: [colour]\n",
    "    normal_test_dir: [null]\n",
    "    task: [segmentation]\n",
    "    mask: [/content/anomalib/datasets/hazelnut_toy/mask/colour]\n",
    "    extensions: [null]\n",
    "    split_ratio: [0.2]\n",
    "    image_size: [256, 128]\n",
    "    num_workers: [4]\n",
    "  model_name:\n",
    "    - padim\n",
    "    - patchcore\n",
    "\"\"\"\n",
    "with open(\"benchmark_config.yaml\", \"w\", encoding=\"utf8\") as f:\n",
    "    f.writelines(benchmarking_params)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "ISlNVY00af7B",
    "outputId": "138840e4-8524-4e6f-c784-f9a175853a4f"
   },
   "outputs": [],
   "source": [
    "!python ./tools/benchmarking/benchmark.py --config benchmark_config.yaml"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 320
    },
    "id": "mKa-0XO_sLLy",
    "outputId": "c90e033d-61e3-4b49-f9ba-5363836d0a42"
   },
   "outputs": [],
   "source": [
    "df = pd.read_csv(\"runs/padim_gpu.csv\")\n",
    "df.head()"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "name": "Anomalib Benchmarking",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3.8.0 ('anomalib')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.0"
  },
  "vscode": {
   "interpreter": {
    "hash": "ba723bee1893023fba5911c5ba85dac05fe2496fa0862b3e274bad096c0e1e2a"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
